50 research outputs found

    Workshop 1 - Omics Data Workshop

    Participants will gain hands-on experience with these analyses using tools for pattern discovery in multi-omics. Interspersed with lecture content, attendees will work through multi-omics analysis exercises with real data. Participants are strongly encouraged to bring their own data and study examples for application. Open to computational biologists, bioinformaticians, principal investigators, and their research teams, including advanced Ph.D. students. Basic familiarity with upstream multi-omics bioinformatics tools is recommended. Beginner-level familiarity with R is required. Methodological advances, paired with multi-omics data measured using high-throughput technologies, make it possible to capture comprehensive snapshots of biological activity. In particular, low-cost, culture-independent omics profiling has made metagenomics, metabolomics, and proteomics ("multi-omics") surveys of human health, other hosts, and the environment possible. The resulting data have stimulated the development of new statistical and computational approaches to analyze and integrate omics data, including human gene expression, microbial gene products, metabolites, and proteins, among others. Multi-omics data generated from diverse platforms are often fed into generic downstream analysis software without proper appreciation of the inherent differences in the data, which can result in incorrect interpretations. Further, there is a large collection of downstream analysis software platforms, and selecting the most appropriate tool can be difficult for untrained researchers. In this workshop, we will therefore present a high-level introduction to computational multi-omics, highlighting the state of the art in the field as well as outstanding challenges, with a focus on downstream analysis methods. This will include an introduction to the biological goals of typical multi-omics studies and the statistical methods currently available to achieve them.

    Rateless Codes with Optimum Intermediate Performance

    Abstract: In this paper, we design several degree distributions for rateless codes with optimum intermediate packet recovery rates. In rateless coding, the employed degree distribution significantly affects the packet recovery rate. Each degree distribution is designed based on the number of message packets, k, and the desired coding overhead, γ, which is the ratio of the number of received packets, n, to k, i.e., γ = n/k. Previously designed degree distributions are tuned for full recovery of all source packets for values of γ slightly larger than 1 and, as a consequence, they show very small packet recovery rates for γ < 1. Hence, finding degree distributions with maximal packet recovery rates in the intermediate range, 0 < γ < 1, is of interest. We define the packet recovery rates at three values of γ as our conflicting objective functions and employ the NSGA-II multi-objective genetic algorithm to find several degree distributions with optimum packet recovery rates. We propose degree distributions for both finite and infinite (asymptotic) k.
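To make the notion of an intermediate packet recovery rate concrete, here is a minimal simulation sketch. It is not the paper's method: it assumes an LT-style rateless code decoded by iterative peeling, and the function names, the example degree distribution, and the parameter values are all hypothetical. It estimates the fraction of the k message packets recovered after receiving n = γk encoded packets for some γ < 1.

```python
import random


def sample_degree(dist, rng):
    """Draw a degree from dist, a dict mapping degree -> probability."""
    degrees = list(dist)
    weights = [dist[d] for d in degrees]
    return rng.choices(degrees, weights=weights, k=1)[0]


def recovered_fraction(k, gamma, dist, rng):
    """Fraction of k message packets recovered from n = round(gamma * k)
    encoded packets, using iterative peeling decoding.

    Only the index sets matter for decodability, so payloads are omitted.
    """
    n = round(gamma * k)
    # Each encoded packet combines a uniformly chosen set of message packets.
    packets = [
        set(rng.sample(range(k), min(sample_degree(dist, rng), k)))
        for _ in range(n)
    ]
    recovered = set()
    progress = True
    while progress:
        progress = False
        for neighbours in packets:
            # Strip message packets that are already known.
            neighbours -= recovered
            if len(neighbours) == 1:
                # One unknown neighbour left: it can be decoded now.
                recovered |= neighbours
                progress = True
    return len(recovered) / k


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical distribution weighted towards low degrees.
    example_dist = {1: 0.5, 2: 0.3, 3: 0.2}
    for gamma in (0.25, 0.5, 0.75):
        rate = recovered_fraction(1000, gamma, example_dist, rng)
        print(f"gamma={gamma:.2f} recovered fraction={rate:.3f}")
```

Averaging this estimate over many trials at three chosen values of γ would give one possible way to evaluate the three objective values that a multi-objective optimizer such as NSGA-II could then trade off.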

    Multivariable association discovery in population-scale meta-omics studies.

    It is challenging to associate features such as human health outcomes, diet, environmental conditions, or other metadata with microbial community measurements, due in part to their quantitative properties. Microbiome multi-omics are typically noisy, sparse (zero-inflated), high-dimensional, extremely non-normal, and often in the form of count or compositional measurements. Here we introduce an optimized combination of novel and established methodology to assess multivariable association of microbial community features with complex metadata in population-scale observational studies. Our approach, MaAsLin 2 (Microbiome Multivariable Associations with Linear Models), uses generalized linear and mixed models to accommodate a wide variety of modern epidemiological studies, including cross-sectional and longitudinal designs, as well as a variety of data types (e.g., counts and relative abundances) with or without covariates and repeated measurements. To construct this method, we conducted a large-scale evaluation of a broad range of scenarios under which straightforward identification of meta-omics associations can be challenging. These simulation studies reveal that MaAsLin 2's linear model preserves statistical power in the presence of repeated measures and multiple covariates, while accounting for the nuances of meta-omics features and controlling false discovery. We also applied MaAsLin 2 to a microbial multi-omics dataset from the Integrative Human Microbiome Project (HMP2) which, in addition to reproducing established results, revealed a unique, integrated landscape of inflammatory bowel diseases (IBD) across multiple time points and omics profiles.
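The abstract mentions controlling false discovery across many per-feature tests but does not name the procedure. As an illustration only, the sketch below assumes the Benjamini-Hochberg adjustment, a common choice for this purpose. The function name and the example p-values are hypothetical, and this is not MaAsLin 2's code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values, one per tested feature-metadata pair.
    example = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(example))
```

Features whose adjusted value falls below a chosen threshold would then be reported as associated, which bounds the expected proportion of false positives among the reported associations under the procedure's assumptions.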